STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation

Authors

Abstract

Video Instance Segmentation (VIS) is a task that simultaneously requires classification, segmentation, and instance association in a video. Recent VIS approaches rely on sophisticated pipelines to achieve this goal, including RoI-related operations or 3D convolutions. In contrast, we present a simple and efficient single-stage framework based on the segmentation method CondInst by adding an extra tracking head. To improve accuracy, a novel bi-directional spatio-temporal contrastive learning strategy for embedding across frames is proposed. Moreover, an instance-wise temporal consistency scheme is utilized to produce temporally coherent results. Experiments conducted on the YouTube-VIS-2019, YouTube-VIS-2021, and OVIS-2021 datasets validate the effectiveness and efficiency of the proposed method. We hope it can serve as a strong baseline for other instance-level video tasks.
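The abstract does not spell out the loss, so the following is only a minimal sketch of what a bi-directional contrastive objective over per-instance embeddings from two frames could look like. It assumes an InfoNCE-style loss with cosine similarity; the function names (`info_nce`, `bidirectional_contrastive_loss`), the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, candidates, match, temperature=0.1):
    """Mean InfoNCE loss; anchors[i] should match candidates[match[i]]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[match[i]]
    return total / len(anchors)


def bidirectional_contrastive_loss(emb_t, emb_s, match_t_to_s, temperature=0.1):
    """Average the loss from frame t to frame s and from frame s back to frame t.

    Assumption: match_t_to_s[i] is the index in emb_s of the instance that
    corresponds to emb_t[i], and each matched index appears only once.
    """
    forward = info_nce(emb_t, emb_s, match_t_to_s, temperature)
    # Invert the matching so that instances in frame s act as anchors.
    inverse = {j: i for i, j in enumerate(match_t_to_s)}
    anchors_s = [emb_s[j] for j in sorted(inverse)]
    match_s_to_t = [inverse[j] for j in sorted(inverse)]
    backward = info_nce(anchors_s, emb_t, match_s_to_t, temperature)
    return 0.5 * (forward + backward)


# Toy embeddings for two instances seen in two frames (made-up values).
frame_t = [[1.0, 0.0], [0.0, 1.0]]
frame_s = [[0.9, 0.1], [0.1, 0.9]]
aligned = bidirectional_contrastive_loss(frame_t, frame_s, [0, 1])
swapped = bidirectional_contrastive_loss(frame_t, frame_s, [1, 0])
```

With these toy values the correctly matched pairing gives a lower loss than the swapped pairing, which is the behaviour a contrastive association objective is meant to encourage.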


Similar resources

Low-Rank Spatio-Temporal Video Segmentation

Robust Principal Component Analysis (RPCA) has generated a great amount of interest for background/foreground estimation in videos. The central hypothesis in this setting is that a video’s background can be well-represented by a low-rank model. However, in the presence of complex lighting conditions this model is only accurate in localised spatio-temporal regions. Following this observation, we...


Spatio-Temporal Segmentation of Video Data

Image segmentation provides a powerful semantic description of video imagery essential in image understanding and efficient manipulation of image data. In particular, segmentation based on image motion defines regions undergoing similar motion allowing image coding system to more efficiently represent video sequences. This paper describes a general iterative framework for segmentation of video ...


Automatic Spatio-Temporal Video Sequence Segmentation

In the paper, an automatic spatio-temporal video sequence segmentation algorithm is proposed. To address this very difficult computer vision problem, several novel algorithms have been developed which use both spatial and temporal information. First, a novel temporal segmentation algorithm is developed based on our previous work in motion estimation. Second, an iterative split-and-merge spatial s...


Video region segmentation by spatio-temporal watersheds

We propose a video region segmentation scheme combining spatio-temporal edges and watershed techniques. We consider the video sequence as a 3-D volume and compute color edges within this volume. These color edges form a vector field that is in turn used to obtain an edge function. This edge function is used as a topological surface for a watershed grouping stage. Considering the video as a 3-D ...


Adversarial Spatio-Temporal Learning for Video Deblurring

Camera shake or target movement often leads to undesired blur effects in videos captured by a hand-held camera. Despite significant efforts being devoted to video-deblur research, two major challenges remain: 1) how to model the spatiotemporal characteristics across both the spatial domain (i.e. image plane) and temporal domain (i.e. neighboring frames), and 2) how to restore sharp image detail...



Journal

Journal title: Lecture Notes in Computer Science

Year: 2023

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-25069-9_35